ADA099987


NBDL-M001


STABILIZATION  AND  TASK  DEFINITION  IN  A  PERFORMANCE  TEST  BATTERY 


Marshall  B.  Jones 


October  1980 


NAVAL  BIODYNAMICS  LABORATORY 
New  Orleans,  Louisiana 


Approved for public release. Distribution unlimited.



UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE
READ INSTRUCTIONS BEFORE COMPLETING FORM

1. REPORT NUMBER
NBDL-M001

2. GOVT ACCESSION NO.
[illegible]

3. RECIPIENT'S CATALOG NUMBER

4. TITLE (and Subtitle)
Stabilization and Task Definition in a Performance Test Battery

5. TYPE OF REPORT & PERIOD COVERED
Monograph

6. PERFORMING ORG. REPORT NUMBER

7. AUTHOR(s)
Marshall B. Jones

8. CONTRACT OR GRANT NUMBER(s)
Contract No. N0023-79-M-5089

9. PERFORMING ORGANIZATION NAME AND ADDRESS
Department of Behavioral Sciences
Milton S. Hershey Medical Center, Penn St. Univ.
Hershey, PA 17033

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS
Project No. F58524
Task Area [illegible]
Work Unit MF58.524.002-5027

11. CONTROLLING OFFICE NAME AND ADDRESS
Naval Medical Research & Development Command
Bethesda, MD 20014

12. REPORT DATE
May 12, 1979

13. NUMBER OF PAGES
59

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)
Naval Biodynamics Laboratory
Box 29407
New Orleans, LA 70189

15. SECURITY CLASS. (of this report)
Unclassified

15a. DECLASSIFICATION/DOWNGRADING SCHEDULE

16. DISTRIBUTION STATEMENT (of this Report)
Approved for public release, distribution unlimited

17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)

18. SUPPLEMENTARY NOTES
Final Report on Navy Contract No. N0023-79-M-5089

19. KEY WORDS (Continue on reverse side if necessary and identify by block number)
Performance, Skill Acquisition, Task Stabilization, Individual Differences,
Performance Testing, Environmental Stress

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

Performance testing under unusual environmental circumstances almost always
involves repeated-measure designs. Most tasks, however, show practice effects
with repeated administrations, effects that may appear in the group mean, the
variance among subjects, or the correlations over subjects among trials or
repeated testings. There comes a point in many tasks after which practice no
longer produces changes in performance; as we will put it, the task stabilizes.
Our criteria for stabilization are: the group mean no longer increases, or
increases at a slow and regular rate; the variance among subjects no longer
changes; and the correlation with earlier trials remains the same from one
stabilized trial to the next; finally, the correlation among stabilized trials
is constant. Stabilization in this sense is virtually essential to a
performance test battery. When it is absent, practice and environmental effects
are confounded; interpretation becomes very difficult; and problems of design
are greatly complicated, in some cases impossibly so. It is desirable,
therefore, to determine in advance whether or not a task stabilizes with
practice and, if so, how long it takes. It is additionally desirable that a
task be well defined, that is, that it stabilize at a high level; preferably
the average correlation among stabilized trials should be greater than .90.
The present report concerns ten tasks each of which was practiced for 15 days
by either 18 or 19 subjects. The ten tasks were: Complex Counting, Grammatical
Reasoning, Code Substitution, Stroop Color-Words, Arithmetic, Letter Search,
Critical Tracking, Compensatory Tracking, Time Estimation, and Spoke Trail
Making. The ten tasks were practiced in the order given. All subjects were
Navy enlisted men between 19 and 24 years old and with 20/20 vision. The ten
tasks were all analyzed according to the criteria mentioned above, beginning
with the mean and variance and then determining the stability of cross-session
reliabilities. Analysis of the ten tasks was straightforward and according to
the criteria mentioned above with respect to the means and standard deviations.
In order to determine the stability of the correlation among trials, a series
of two-way ANOVAs was applied to the correlation matrices. Each of the ten
tasks was subjected to this same step-wise analysis. Some tasks, for example,
Arithmetic, stabilized completely. Others stabilized in some respects but
not in others. Some tasks had more than one dependent measure and in these
cases stabilization sometimes occurred in one dependent measure when it did
not occur in another. The bulk of the report is given over to detailing these
results and describing the application of the ANOVA employed.

Unclassified
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


NBDL  -  M001 


STABILIZATION  AND  TASK  DEFINITION  IN  A  PERFORMANCE  TEST  BATTERY 


Marshall  B.  Jones 


October  1980 


Bureau  of  Medicine  and  Surgery 
Work Unit MF58.524.002-5027
Navy  Contract  N0023-79-M-5089 


Approved by
Channing L. Ewing, M.D.
Scientific Director

Released by
Captain J. E. Wenger, MC, USN
Commanding Officer


Naval  Biodynamics  Laboratory 
Box  29407 

New Orleans, LA 70189

Opinions or conclusions contained in this report are those of the author(s) and do not
necessarily reflect the views or the endorsement of the Department of the Navy.

Approved  for  public  release;  distribution  unlimited. 


ABSTRACT 


Performance testing under unusual environmental circumstances almost
always involves repeated-measure designs. Most tasks, however, show practice
effects with repeated administrations, effects that may appear in the group
mean, the variance among subjects, or the correlations over subjects among
trials or repeated testings. There comes a point in many tasks after which
practice no longer produces changes in performance; as we will put it, the
task stabilizes. Our criteria for stabilization are: the group mean no longer
increases, or increases at a slow and regular rate; the variance among subjects
no longer changes; and the correlation with earlier trials remains the same
from one stabilized trial to the next; finally, the correlation among stabilized
trials is constant. Stabilization in this sense is virtually essential to a
performance test battery. When it is absent, practice and environmental effects
are confounded; interpretation becomes very difficult; and problems of design
are greatly complicated, in some cases impossibly so. It is desirable,
therefore, to determine in advance whether or not a task stabilizes with practice
and, if so, how long it takes. It is additionally desirable that a task be
well defined, that is, that it stabilize at a high level; preferably the average
correlation among stabilized trials should be greater than .90. The present
report concerns ten tasks each of which was practiced for 15 days by either 18 or
19 subjects. The ten tasks were: Complex Counting, Grammatical Reasoning, Code
Substitution, Stroop Color-Words, Arithmetic, Letter Search, Critical Tracking,
Compensatory Tracking, Time Estimation, and Spoke Trail Making. The ten tasks
were practiced in the order given. All subjects were Navy enlisted men between
19 and 24 years old and with 20/20 vision. The ten tasks were all analyzed
according to the criteria mentioned above, beginning with the mean and variance
and then determining the stability of cross-session reliabilities. Analysis of
the ten tasks was straightforward and according to the criteria mentioned above
with respect to the means and standard deviations. In order to determine the
stability of the correlation among trials, a series of two-way ANOVAs was
applied to the correlation matrices. Each of the ten tasks was subjected to
this same step-wise analysis. Some tasks, for example, Arithmetic, stabilized
completely. Others stabilized in some respects but not in others. Some tasks
had more than one dependent measure and in these cases stabilization sometimes
occurred in one dependent measure when it did not occur in another. The bulk of
the report is given over to detailing these results and describing the
application of the ANOVA employed.

Marshall B. Jones is with the Department of Behavioral Sciences, Milton S.
Hershey Medical Center, Pennsylvania State University, Hershey, PA 17033.

Final Report on Navy Contract No. N0023-79-M-5089, May 12, 1979.

Trade names of materials or products of commercial or nongovernment
organizations are cited only where essential to precision in describing research
procedures or evaluation of results. Their use does not constitute official
endorsement or approval of the use of such commercial hardware or software.

All volunteer subjects were recruited, evaluated, and employed in accordance
with procedures specified in Secretary of the Navy Instruction 3900.39
series and Bureau of Medicine and Surgery Instruction 3900.6 series. These
instructions are based upon voluntary informed consent, and meet the
provisions of prevailing national and international guidelines.


STABILIZATION  AND  TASK  DEFINITION  IN  A  PERFORMANCE  TEST  BATTERY 


Marshall  B.  Jones 

Kennedy  and  Bittner  (1977)  have  recently  detailed  the  need  for  a 
performance  test  battery  to  study  the  effects  of  unusual  environments 
over  prolonged  exposure  periods.  The  same  authors  also  point  out  that 
performance  testing  in  environmental  research  almost  always  involves 
repeated-measure designs. This latter circumstance has definite consequences
for the properties that a performance test or battery should have.

When  a  task  or  test  is  administered  on  repeated  occasions,  it  usually 
shows  practice  effects;  and  these  effects  may  appear  in  the  mean,  the 
variance  among  subjects,  or  in  the  correlations  over  subjects  among 
trials  or  repetitions.  If  practice  is  continued,  there  comes  a  point 
in  many  tasks  after  which  practice  effects  no  longer  appear;  as  we  will 
put it, the task stabilizes. The mean becomes asymptotic or increases at
a  slow  and  regular  rate;  the  variance  among  subjects  remains  the  same  from 
trial  to  trial;  and  the  correlation  with  trials  earlier  in  the  practice 
sequence  remains  the  same  from  one  stabilized  trial  to  another;  in  addition, 
the  correlation  between  any  two  stabilized  trials  is  constant.  Not  all 
tasks stabilize, however, and among those that do some stabilize more
quickly than others (Jones, 1972). Furthermore, different tasks may stabilize
at different levels; that is, the average correlation among stabilized trials
may vary from one task to another (Jones, 1970 a & b).

If a task does not stabilize or has not been stabilized in a group of subjects,
its use in a repeated-measure design is compromised from the start. If the
data are analyzed by univariate analysis of variance, one of the requirements
of a repeated-measure design, compound symmetry, may not be met, with serious
consequent difficulties for the analysis (Winer, 1971). These difficulties
can be overcome in large measure by resort to multivariate statistical methods,
but only if the subjects outnumber the repeated measurements, preferably by
a considerable margin (Morrison, 1967). This condition, however, is often
difficult or impossible to meet in field experiments under unusual
environmental circumstances.

If the battery or tests from it are used to monitor individual performance,
further difficulties arise. If a task has not been stabilized, the correlations
among successive trials will very likely show "superdiagonal form" (Jones, 1969a).
That is, the correlation between two trials decreases with the separation between
them and, hence, is largest when one trial immediately follows the other. This
pattern has been interpreted by some workers to mean that the differential
composition of the task is changing and by others to mean that the abilities
possessed by the subjects are changing (Alvares and Hulin, 1972, 1973; Dunham, 1974).
Under either interpretation an individual's performance could deteriorate or
improve over a given span of testings for reasons that have nothing to do with
concurrent environmental stresses or events. If the task is changing, the
subject may do poorly because he happens to be weak in the abilities or other
factors that are prominent in the differential composition of the test over that
particular span of testings. If the subject's abilities are changing, then
clearly his or her performance may change also and altogether independently of
external factors.

The presence of superdiagonal form also makes it difficult to know "what
is being measured." To begin with, there is the ambiguity as to basic
interpretation. Is it the task or the subject who is changing or, perhaps, both?
If the task is changing, then the interpretation of performance changes with
every stage of practice. If the subject is changing, then he or she possesses
a somewhat different mix of relevant abilities at every stage of practice.

For  all  these  reasons  a  test,  to  be  included  in  a  performance  test 
battery  for  environmental  research,  should  stabilize.  For  many  trials  or 
administrations  the  test  may  show  practice  effects  but  there  must  come  a  point 
after  which  the  test  (or  subject)  no  longer  changes.  It  is  additionally  desirable 
that the test stabilize at a high level, preferably greater than .80. We will
call the level at which a task stabilizes, that is, the average correlation
among stabilized trials, task definition. A task is well or poorly defined
according as this average ranges up or down from .80.
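
As a rough illustration of task definition, the sketch below averages the
Pearson correlations over all pairs of stabilized days. The function names,
the data layout (one list of subject scores per day), and the scores themselves
are assumptions made for the example; they are not taken from the report.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def task_definition(scores_by_day, stable_days):
    """Average correlation over all pairs of stabilized days.

    scores_by_day maps a day number to one score per subject,
    always in the same subject order.
    """
    pairs = list(combinations(stable_days, 2))
    total = sum(pearson(scores_by_day[a], scores_by_day[b]) for a, b in pairs)
    return total / len(pairs)


# Hypothetical scores for four subjects on three stabilized days.
scores = {
    6: [10.0, 12.0, 15.0, 19.0],
    7: [11.0, 12.0, 16.0, 18.0],
    8: [10.0, 13.0, 15.0, 20.0],
}
definition = task_definition(scores, [6, 7, 8])
label = "well defined" if definition >= 0.80 else "poorly defined"
print(round(definition, 3), label)
```

With real data the list of stabilized days would come from the stability
analysis itself, not from a fixed choice as in this toy example.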

The primary concern of the Naval Aerospace Medical Research Laboratory Detachment
is with inertial environments, a particular interest being the very low frequency
motions (< 1 Hz) which occasion seasickness and air sickness. In this connection,
a research program is underway to develop a test battery for evaluating the
performance of a subject who may be exposed to such motions. The general plan
of this Performance Evaluation Test for Environmental Research (PETER) is
discussed elsewhere (Kennedy and Bittner, 1977); other findings are reported
in Kennedy and Bittner (1978 a & b).

This report concerns ten tasks each of which has been practiced for 15 days
by 18 or 19 volunteer subjects. In each case our chief concern will be whether
or not the task stabilizes and, if so, after how many trials. The remainder
of the report is organized into three sections. The first develops the analysis
to be used throughout the report, with Critical Tracking serving as an illustrative
task. The second section presents the findings for five tasks all of which
stabilize quickly and have acceptable task definition: Code Responses,
Grammatical Reasoning, Arithmetic, Stroop Color-Words, and Two-Dimensional
Tracking. (Critical Tracking also stabilizes with acceptable task definition.)
The third section presents results concerning four tasks (Complex Counting,
Time Estimation, Letter Search, and the Spoke Trail-Making Test) which either
do not stabilize or, if they do, have unacceptably low task definition.

1.  ANALYSIS 

Superdiagonal  form 

Each of the ten tasks to be studied in this report was administered to a
group of 18, sometimes 19, volunteer subjects for 15 consecutive working days.
The results to be studied take the form of 15 data points for each subject on
each task, each point representing that subject's average performance on one
day. On one of the tasks studied, Stroop Color-Words, three measures of
performance were obtained for each subject on each day; on another task, Arithmetic,
two measures of performance were obtained; and on the remaining eight tasks only
one measure of performance was obtained. However, one of these tasks, Spoke
Trail-Making, existed in two forms, experimental and control; in effect, it
constituted two tasks.

Group means and variances for each task on each day have already been
presented in another place (Kennedy and Bittner, 1978a). Therefore, while
occasional reference may be made in this report to means and variances, the
focus will be on correlation. Our concern will be to determine which tasks,
if any, are differentially stable. That is, does there come a point in practice
on the task where the position of one subject relative to another does not change,
except for random error, as long as external circumstances and subjective
conditions remain the same? It should be underscored that instability, as
we use the term, does not consist in differential change per se but, rather,
in endogenous change, that is, change resulting from practice alone. If the
relative ordering among subjects changes in response to some unusual
environmental circumstance, for example, an immediately preceding holiday or partial
failure of the air conditioning system, the fact is no argument against stability.
Indeed, sensitivity to altered environmental or subjective conditions is
generally desirable in a performance test. What is not desirable is a change
in differential structure as a function of simply taking the test or taking it
again.

But how are we to know whether a change in differential structure is endogenous
or not? Plainly, this question must have a satisfactory answer or otherwise
no attempt to determine task stability can succeed; fortunately it does.

Correlations among trials of practice are usually patterned and, when they
are, are always patterned in the same way. This pattern, moreover, is almost
always associated with change in the mean or variance; and the more pronounced
the change in the mean or variance the rarer it is that this pattern is not
found. In addition, this pattern is easily shown to depend on uniform external
circumstances. That is, by altering test circumstances or subjective conditions
the pattern can be disrupted or even obliterated (Jones, 1969b). Finally, this
pattern is naturally and easily explained in terms of continuous endogenous
processes, each one taking root at a definite point in practice, continuing for
a series of consecutive trials, and then dropping out. On all counts, therefore,
this pattern appears to be the correlational counterpart of endogenous differential
change. To determine whether or not a task has stabilized it is sufficient
to find out whether or not this pattern is still present.

The pattern in question is "superdiagonal form." The correlations are
largest between two trials (in our case, days), one of which immediately follows
the other in practice. Correlations between trials that are separated by one
trial, for example, days 5 and 7, are smaller. The greater the separation
between two trials the smaller the correlation between them is. Hence, the
smallest intertrial correlation is found in the upper right-hand corner of the
matrix.

The correlations in any one row all involve the same first trial, with
the second trial being more and more removed in the practice sequence. Similarly,
the correlations in any one column all involve the same second trial, with the
first trial coming earlier and earlier in the practice sequence. The
requirements of superdiagonal form are that

$$r_{i,j} > r_{i,j+1}$$

and

$$r_{i,j} > r_{i-1,j}$$

where $r_{i,j}$ is the correlation between trials $i$ and $j$, with $i < j$.
That is, the correlations must decrease along the rows to the right and up the
columns.
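
The two requirements can be checked mechanically on an observed matrix. The
following is a minimal sketch, assuming the matrix is held as a list of lists
ordered by practice and that strict inequality is wanted; the function name and
the two small example matrices are hypothetical. It takes no account of
sampling error, so it describes a pattern and does not test its significance.

```python
def superdiagonal_violations(r):
    """Return the (row, column) cells above the diagonal that break the pattern.

    r is a square, symmetric correlation matrix (list of lists) whose rows
    and columns follow the order of practice.
    """
    n = len(r)
    violations = []
    for i in range(n):
        for j in range(i + 1, n):
            # Moving right along row i, the correlation should fall.
            falls_to_right = j + 1 >= n or r[i][j] > r[i][j + 1]
            # Moving up column j, the correlation should fall.
            falls_upward = i == 0 or r[i][j] > r[i - 1][j]
            if not (falls_to_right and falls_upward):
                violations.append((i, j))
    return violations


# Hypothetical matrices: one patterned by separation, one flat.
lag_value = {0: 1.0, 1: 0.9, 2: 0.8, 3: 0.7}
patterned = [[lag_value[abs(i - j)] for j in range(4)] for i in range(4)]
flat = [[1.0 if i == j else 0.8 for j in range(4)] for i in range(4)]

print(superdiagonal_violations(patterned))
print(superdiagonal_violations(flat))
```

An empty list means every cell above the diagonal satisfies both requirements.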

General versus local differential change

The idea of stability does not apply to the task itself or even to all trials
of practice on the task but, rather, to all trials past a certain point or, better,
to all trials between two points in practice. In our case we will start each
analysis by asking whether or not all trials after day 5 are stable, that is,
trials 6 through 15. Once, however, we recognize that stability is specific to
a series of trials, not generally beginning with trial 1, then we must also
recognize that it exists in two distinct forms.

Consider  our  own  case,  that  is,  trials  6  through  15.  If  these  ten  trials 
are  part  of  an  overarching  superdiagonal  form  that  begins  with  trial  1  and 
continues  through  trial  15,  one  consequence  is  that  trial  6  must  correlate 
more  strongly  with  the  first  five  trials  than  trial  7  does;  trial  7  must 
correlate  more  strongly  with  the  first  five  trials  than  trial  8  does,  and  so 
on.  In  other  words,  continuing  differential  change  in  trials  6  through  15  means, 
among  other  things,  that  each  successive  trial  in  this  10-trial  series  is  more 
and  more  removed  from  the  first  five  trials.  As  practice  continues,  each 
successive trial correlates less and less strongly with a fixed set of preceding
trials.
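
This consequence can be looked at directly by averaging, for each later trial,
its correlations with the fixed set of early trials. The sketch below assumes a
full intertrial correlation matrix stored as a list of lists; the function name
and the 8-trial matrix (correlation falling by 0.05 per trial of separation)
are invented for illustration.

```python
def mean_corr_with_early_trials(r, n_early):
    """For each trial after the first n_early, average its correlations
    with those first n_early trials.

    r is the full intertrial correlation matrix, ordered by practice.
    """
    n = len(r)
    return [
        sum(r[i][j] for i in range(n_early)) / n_early
        for j in range(n_early, n)
    ]


# Hypothetical matrix with an overarching superdiagonal pattern.
n_trials = 8
r = [
    [1.0 if i == j else 0.90 - 0.05 * abs(i - j) for j in range(n_trials)]
    for i in range(n_trials)
]
averages = mean_corr_with_early_trials(r, n_early=3)
steadily_declining = all(a > b for a, b in zip(averages, averages[1:]))
print([round(a, 2) for a in averages], steadily_declining)
```

A steady decline in these averages is what continuing differential change
would produce; averages that stay level are what stability would produce.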

At the same time, an overarching superdiagonal form implies that the
correlations among trials 6 through 15 have superdiagonal form also. If differential
change continues over these ten trials, then intertrial correlation over this
same series considered entirely by itself must be patterned in the superdiagonal
way.

We  will  call  these  two  kinds  of  instability  general  and  local  differential 
change.  General  differential  change  takes  place  relative  to  an  external  set  of 
measures.  In  this  report  these  external  measures  are  always  preceding  trials  of 
practice on the same task. The idea, however, of general differential change also
includes  change  relative  to  other  tasks.  If  the  correlations  between  successive 
trials  of  practice  and  a  reference  test  regularly  either  decrease  or  increase, 
the  fact  is  evidence  of  general  differential  change.  Local  differential  change 
is change within a series of consecutive trials. A series of trials that shows no
change with respect to external measures may nevertheless be changing from
trial  to  trial  internally.  Local  change  is  not  just  another  aspect  of  a 
single  underlying  differential  process.  Local  and  general  differential 
change  are  two  different  things  and  do  not  necessarily  occur  together. 

This last point is central, in part because general and local differential
change do not have equal claims on our attention. Suppose that all trials on
a task after trial $m$ are entirely stable in the general sense. Any local
instability that the task may then show is altogether specific to trials
after $m$ on that task. A structuring of specific variance, however, has very
few, if any, practical consequences. It has no appreciable effect on the
ability of the trials at issue either to predict external measures or be
predicted by them. Let S, consisting of trials $s_{m+1}, s_{m+2}, \ldots, s_{m+n}$, be
entirely stable in the general sense and c an outside criterion. Then the
correlations between the $s_{m+i}$ and c are all the same. Suppose now that the
correlations among the $s_{m+i}$ are either all the same or patterned in the
superdiagonal way. How much difference does this last variation make to the
multiple correlation between S and c? The answer is, not much unless the
superdiagonal pattern within S is steep; and steep superdiagonal patterns
(hence, rapid local change) do not exist or have not been observed in the
absence of general differential change.
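
The size of the effect can be explored numerically. The sketch below compares
the squared multiple correlation between S and c when the correlations within S
are flat and when they follow a gentle superdiagonal pattern. Every number in
it (ten trials, a criterion correlation of 0.5, intertrial correlations near
0.8) is an assumption chosen for illustration, and the small linear solver is
included only to keep the example self-contained.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, n):
            factor = m[k][col] / m[col][col]
            for c in range(col, n + 1):
                m[k][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        partial = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - partial) / m[i][i]
    return x


def squared_multiple_correlation(r_within, r_with_c):
    """R^2 = r' R^{-1} r for predicting the criterion c from the trials in S."""
    weights = solve(r_within, r_with_c)
    return sum(w * r for w, r in zip(weights, r_with_c))


n = 10
r_c = [0.5] * n  # assumed: every trial in S correlates 0.5 with c

# Flat case: every pair of trials correlates 0.80.
flat = [[1.0 if i == j else 0.80 for j in range(n)] for i in range(n)]
# Gentle superdiagonal case: 0.83 at a separation of 1, down to 0.75 at 9.
gentle = [
    [1.0 if i == j else 0.84 - 0.01 * abs(i - j) for j in range(n)]
    for i in range(n)
]

r2_flat = squared_multiple_correlation(flat, r_c)
r2_gentle = squared_multiple_correlation(gentle, r_c)
print(round(r2_flat, 3), round(r2_gentle, 3))
```

The flat case can be checked by hand: with n trials, a common intertrial
correlation ρ, and a common criterion correlation r, the squared multiple
correlation is n r² / (1 + (n − 1)ρ), which is 2.5/8.2 or about 0.305 for the
values assumed here.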

In practice, of course, we do not test for all possible kinds of general
change. In this report we will look for it only in relation to preceding
(or, occasionally, following) trials on the same task, not other tasks or
criteria. If, however, a series of trials, S, is stable relative to preceding
trials and the average correlation among trials in S (task definition) does
not greatly exceed the average correlation between these trials and just
preceding trials, then the probability that the trials in S change appreciably
with respect to any external criterion is low. Virtually all of the reliable
variance in S is accounted for by its relations with just preceding trials and
these trials, by hypothesis, all correlate equally with trials in S. It is
technically possible for another task or external measure to show differential
change relative to S but not likely and certainly not in a large way.

Local change is, therefore, a distinctly secondary matter. If general
stability exists, local change has few, if any, practical consequences. If
general stability does not exist, local stability is unlikely and cannot in
any case gainsay continuing general change.

In the next two sections we will consider how to test for general and local
differential change. We will then take up one or two matters that concern both
problems.

Testing for general differential change

Table 1 contains the correlations between the first five and the last ten
days on Critical Tracking. Table 2 presents the analysis of variance for these
same data, with the row and column effects broken down into linear and nonlinear
components.

The row averages show large and overwhelmingly significant increases for
the first five days. The regression coefficient for the rows is +0.118 or a
predicted increase of +0.472 from day 1 to day 5. This result means that the
first five days definitely involve differential change. If we let S consist of
the first five trials, then the last ten trials are an external measure; and with
respect to this external measure the trials in S show regular (increasing)
change. It is clear, therefore, that Critical Tracking does not stabilize
before day 6.

The main result, however, is that the linear component in the column
averages is not significant (F = 0.93) when tested against the residual
mean square. The regression coefficient for the columns is -0.004 per day.
Therefore, the predicted average declines by 0.036 from day 6 through day 15.
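
The predicted changes quoted for Critical Tracking follow from multiplying each
regression coefficient by the number of day-to-day steps it spans. The sketch
below reproduces that arithmetic and adds an ordinary least-squares slope
function of the kind that could be applied to a set of row or column averages.
The function names are hypothetical, and the row and column averages themselves
(Table 1) are not reproduced here.

```python
def ols_slope(days, averages):
    """Least-squares slope of the averages regressed on day number."""
    n = len(days)
    mean_day = sum(days) / n
    mean_avg = sum(averages) / n
    sxy = sum((d - mean_day) * (a - mean_avg) for d, a in zip(days, averages))
    sxx = sum((d - mean_day) ** 2 for d in days)
    return sxy / sxx


def predicted_change(slope, first_day, last_day):
    """Change along the fitted line between two days."""
    return slope * (last_day - first_day)


# Regression coefficients reported in the text for Critical Tracking.
row_change = predicted_change(0.118, 1, 5)       # rows: days 1 to 5
column_change = predicted_change(-0.004, 6, 15)  # columns: days 6 to 15
print(round(row_change, 3), round(column_change, 3))
```

Four steps at +0.118 give the +0.472 quoted for the rows, and nine steps at
-0.004 give the decline of 0.036 quoted for the columns.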

Note, however, that the nonlinear component in the column averages is
significant at the .05 level. Since the hypothesis of endogenous change does
not require linear but only monotonic decrease along the columns, the nonlinear
component might involve real elements of differential change. Critical Tracking,
however, is not a case in point. The nonlinear component is significant
because the column averages depart irregularly from the linear regression line,
not because the regression line is itself nonlinear. We have already pointed
out, however, that irregular variations in the column means do not constitute
evidence of endogenous change. On a Monday, for instance, the correlations
are sometimes lower than on other days of the week, presumably because the
subjects have lost some of their edge and, perhaps, some of their motivation
as well over the weekend. The result, though it might well lead to a significant
nonlinear column component, does not constitute evidence of endogenous differential
change. The change results from an alteration in external circumstances (the
weekend), not from practice itself.

We may conclude, therefore, that Critical Tracking is stable with respect to
preceding trials after day 5. But is Critical Tracking after day 5 stable with
respect to other external measures than preceding trials? The evidence we have
on this point is indirect.




The  average  correlation  among  days  6  through  15  on  Critical  Tracking  (task 
definition)  is  0.784.  The  mean  correlations,  however,  between  days  4  and  5  and 
these  same  ten  trials  are  .760  and  .807  respectively.  In  short,  all  of  the 
reliable  variance  in  days  6  through  15  is  accounted  for  by  their  correlations 
with  days  4  and  5.  Another  external  measure,  therefore,  would  almost  have  to 
relate  to  days  6  to  15  through  some  component  that  these  ten  days  share  with 
days  4  and  5.  It  could  be,  of  course,  that  some  components  in  days  4  and  5 
increase  over  the  next  ten  days  and  some  decrease;  but  this  seems  unlikely. 

If, however, all components that days 4 and 5 share with the next ten days
are stable, then the correlations between another external measure and days 6
to 15, mediated as they would almost have to be by one or more of these components,
should also be stable.

All  in  all,  therefore,  it  seems  fair  to  conclude  that  Critical  Tracking  is 
generally  stable  after  day  5. 

Testing  for  local  differential  change 

A. The problem. Table 3 presents the correlations among days 6 through 15 on
Critical Tracking. The question is, does this matrix have significant elements
of superdiagonal form? That some such elements appear in the matrix is clear
from visual inspection. The correlation between days 6 and 15, for example, is
smaller (.71) than any correlation in the superdiagonal. In fact, the average
of the three correlations in the upper right-hand corner of the matrix
((.88 + .71 + .50)/3 = .70) is also smaller than any correlation in the superdiagonal.
But are these differences significant? That is the question to which we now turn.

B. Background. If a task is stable over a series of trials, S, then except for
sampling variations all correlations among trials in S are equal (the matrix is
flat). One possible approach to our question, therefore, might be to test
the observed matrix against the hypothesis of a flat matrix at the population
level. Fortunately, Lawley has advanced a $\chi^2$ test for precisely this question
(Morrison, 1967, pp. 251-252).

Unfortunately, there are serious problems with the use of this test for
our purposes. If Lawley's test results in a significant value of $\chi^2$, we will
conclude that superdiagonal form is present. That is, the alternative hypothesis
to equality of all correlations among trials in S is superdiagonal form. The
problem is that Lawley's test may result in a significant value of $\chi^2$ for reasons
that have nothing to do with superdiagonal form.

Suppose,  for  example,  that  the  correlations  among  trials  in  S  can  be 
perfectly  described  by  a  single  common  factor  with  unequal  factor  loadings. 

Such a matrix will not be flat and, if the loadings are appreciably different,
will almost certainly yield a significant result by Lawley's test. This result
would be seriously misinterpreted, however, by a conclusion in favor of
superdiagonal form. A unit-factor matrix never has superdiagonal form and
superdiagonal form can never be explained in terms of a single common factor.

Lawley's test works well enough as long as $\chi^2$ is not significant. That is,
if all correlations among trials in S are tenably regarded as equal, then local
differential change is absent. If $\chi^2$ is significant, however, we cannot
conclude that differential change is present. To draw this conclusion we
must take some other approach.

A likely possibility is Joreskog's well-known procedures for testing the
simplex model (Joreskog, 1970). The simplex is a special kind of superdiagonal
form. Joreskog hypothesizes a simplex and then develops, first, a
maximum-likelihood procedure for estimating the theoretical correlations and, then,
a $\chi^2$ test for finding out whether or not these correlations are adequate
to explain the empirical results.

Unfortunately, Joreskog's test is also inappropriate for our purposes.
To make the point directly, suppose that the empirical matrix is essentially
flat. Since a flat matrix is a special case of simplicial form, Joreskog's
model will fit the data perfectly. Hence, we would conclude in favor of the
simplicial interpretation or, more generally, superdiagonal form and differential
change. But a flat matrix is the very opposite of differential change. The
trouble with the simplex model is that it explains both change and no change.
Hence, it cannot distinguish between them.

What we need is a hypothesis, like Lawley's, that posits no differential
change and an empirical statistic that reflects it. Then, if we do not obtain
significance (and the test we use has sufficient power), we may conclude in
favor of stability. On the other hand, if we obtain significance, we can
conclude in favor of differential change. In the next section we present such
a test.

C. Diagonal comparisons. We begin with a change in notation. Let $r_{ij}$ be
the ith correlation, reading down and to the right, in the jth diagonal, reading
away from the main diagonal. Thus, the third correlation in the superdiagonal
(.92 in table 3) is $r_{31}$, and the fourth correlation down in the last column
(.78 in table 3) is $r_{46}$. Plainly,

\[ 1 \le j \le n - 1, \qquad 1 \le i \le n - j. \]
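
The indexing convention can be made concrete with a short sketch. The Python below is illustrative only: the function name `diagonal` and the small 4 x 4 matrix are assumptions introduced here and are not part of the original analysis. With trials numbered from 1, the ith correlation in the jth diagonal sits in row i and column i + j of the correlation matrix.

```python
def diagonal(R, j):
    """Return the jth diagonal above the main diagonal of a square
    correlation matrix R, read down and to the right."""
    n = len(R)
    if not 1 <= j <= n - 1:
        raise ValueError("j must satisfy 1 <= j <= n - 1")
    # The ith entry lies in row i, column i + j (1-based); shift by one for Python.
    return [R[i - 1][i + j - 1] for i in range(1, n - j + 1)]


# Hypothetical 4 x 4 matrix, used only to show the indexing.
R = [
    [1.00, 0.90, 0.80, 0.70],
    [0.90, 1.00, 0.85, 0.75],
    [0.80, 0.85, 1.00, 0.88],
    [0.70, 0.75, 0.88, 1.00],
]
superdiagonal = diagonal(R, 1)  # three correlations between adjacent trials
corner = diagonal(R, 3)         # the single correlation in the far corner
```

Diagonal j always holds n - j correlations, which is why the last diagonal (j = n - 1) contains exactly one.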




We advance the model

\[ r_{ij} = \delta_j + \epsilon_{ij}, \]

where $\delta_j$ is a fixed effect associated with diagonal j and all $\epsilon_{ij}$ are
random variables drawn from a normal population with a mean of zero and
variance $\sigma_e^2$. Except for error, all correlations in any one diagonal
are, we suppose, equal. The least-squares estimator for $\delta_j$ is $\bar{r}_j$,
the mean of the correlations in the jth diagonal; this estimator is unbiased.

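The comparisons we propose to make are based on the differences among the
diagonals. Let

\[ \bar{r}_j = \frac{1}{n-j} \sum_{i=1}^{n-j} r_{ij}, \qquad
   N_{j+1} = \sum_{k=j+1}^{n-1} (n-k), \qquad
   R_{j+1} = \frac{1}{N_{j+1}} \sum_{k=j+1}^{n-1} \sum_{i=1}^{n-k} r_{ik}. \]

Then consider the quantities

\[ C_j = \bar{r}_j - R_{j+1}. \]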


Each $C_j$ represents the difference between the average correlation in the jth
diagonal and the average correlation in all diagonals greater than j, that is,
to the "northeast" of the jth diagonal. $C_1$ is the difference between the
average correlation in the superdiagonal and the average of all other correlations
in the matrix. $C_2$ is the difference between the average correlation in the second
diagonal and the average of all correlations that span more than three trials.
Finally, note that no comparison is defined for j = n-1.
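
A minimal sketch of these quantities follows, assuming the correlation matrix is held as a list of lists. The helper names (`diagonal`, `diagonal_comparisons`) and the example matrix are hypothetical; the sketch simply takes, for each j, the mean of diagonal j and subtracts the mean of every correlation lying to the "northeast" of it.

```python
def diagonal(R, j):
    """Correlations in the jth diagonal above the main diagonal (j >= 1)."""
    n = len(R)
    return [R[i][i + j] for i in range(n - j)]


def diagonal_comparisons(R):
    """Return {j: (mean of diagonal j, mean of all diagonals beyond j, C_j)}
    for j = 1, ..., n - 2. No comparison exists for j = n - 1."""
    n = len(R)
    out = {}
    for j in range(1, n - 1):
        own = diagonal(R, j)
        beyond = [r for k in range(j + 1, n) for r in diagonal(R, k)]
        own_mean = sum(own) / len(own)
        beyond_mean = sum(beyond) / len(beyond)
        out[j] = (own_mean, beyond_mean, own_mean - beyond_mean)
    return out


# Hypothetical 4 x 4 matrix, for illustration only.
R = [
    [1.00, 0.90, 0.80, 0.70],
    [0.90, 1.00, 0.85, 0.75],
    [0.80, 0.85, 1.00, 0.88],
    [0.70, 0.75, 0.88, 1.00],
]
comps = diagonal_comparisons(R)
```

A positive comparison for every j is what superdiagonal form would look like in this representation; whether any of them is significant is a separate question.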

The quantities, $C_j$, are comparisons; that is, the sum of the coefficients of
$r_{ik}$ in $C_j$ vanishes. All correlations in the jth diagonal, (n-j) of them, are
multiplied by 1/(n-j); and all correlations in diagonals (j+1) to (n-1), $N_{j+1}$ of
them, are multiplied by $-1/N_{j+1}$. Hence, the sum of all nonzero coefficients in any
one $C_j$ is zero.

Furthermore, the (n-2) comparisons, $C_j$, are all orthogonal to each other.
Given any two comparisons, one of them (say, $C_{j'}$) has nonzero coefficients only
for correlations all of which are included in $R_{j+1}$. Therefore, the sum of
cross-products between the coefficients in $C_j$ and $C_{j'}$ is

\[ -\frac{1}{N_{j+1}} \left[ (n-j')\,\frac{1}{n-j'} - N_{j'+1}\,\frac{1}{N_{j'+1}} \right] = 0. \]
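
The two properties just claimed (coefficients summing to zero, and mutual orthogonality) can be checked mechanically. The sketch below is an illustration rather than part of the original derivation; the function names are hypothetical, and exact fractions are used so that the sums come out as exact zeros rather than rounding residue.

```python
from fractions import Fraction


def coefficients(n, j):
    """Coefficient of every correlation in comparison j, keyed by
    (position i, diagonal k), for an n x n correlation matrix."""
    n_beyond = sum(n - k for k in range(j + 1, n))  # correlations beyond diagonal j
    coef = {}
    for k in range(1, n):
        for i in range(1, n - k + 1):
            if k == j:
                coef[(i, k)] = Fraction(1, n - j)
            elif k > j:
                coef[(i, k)] = Fraction(-1, n_beyond)
            else:
                coef[(i, k)] = Fraction(0)
    return coef


def dot(a, b):
    """Sum of cross-products of two coefficient sets over the same keys."""
    return sum(a[key] * b[key] for key in a)


n = 10  # ten trials, as in the last ten days of practice
C = {j: coefficients(n, j) for j in range(1, n - 1)}
sums = [sum(c.values()) for c in C.values()]
cross = [dot(C[j], C[jp]) for j in C for jp in C if j < jp]
```

Every entry of `sums` and `cross` should be zero for any n of at least 3, since the argument in the text does not depend on the particular value of n.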

The next step is to calculate the sums of squares attributable to the (n-2)
orthogonal comparisons, $SS(C_j)$. Since each comparison has one degree of freedom,
$SS(C_j)$ and $MS(C_j)$ are the same. Finally, we will determine the expected value
of $MS(C_j)$.


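The sum of squares attributable to $C_j$ is $C_j^2$ divided by the sum of the
squared coefficients, $\sum c_{ik}^2$, where $c_{ik}$ is the coefficient of $r_{ik}$ in $C_j$. Hence,

\[ \sum c_{ik}^2 = (n-j)\,\frac{1}{(n-j)^2} + N_{j+1}\,\frac{1}{N_{j+1}^2}
   = \frac{1}{n-j} + \frac{1}{N_{j+1}}. \]

All told, therefore,

\[ SS(C_j) = \frac{(n-j)\,N_{j+1}}{(n-j) + N_{j+1}}\,(\bar{r}_j - R_{j+1})^2. \]

The expected value of this quantity is

\[ E[MS(C_j)] = \sigma_e^2 + \frac{(n-j)\,N_{j+1}}{(n-j) + N_{j+1}}\,(\delta_j - \Delta_{j+1})^2, \]

where $\Delta_{j+1}$ is the population counterpart of $R_{j+1}$. That is,

\[ \Delta_{j+1} = \frac{1}{N_{j+1}} \sum_{k=j+1}^{n-1} (n-k)\,\delta_k. \]

It only remains to determine a proper estimate of $\sigma_e^2$.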
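
As a rough sketch of the computation, the sum of squares for one comparison is the squared comparison divided by the sum of its squared coefficients. The function name `ss_comparison` and the example matrix are assumptions made for illustration; only the standard single-degree-of-freedom contrast formula is used.

```python
def ss_comparison(R, j):
    """Sum of squares (one degree of freedom) for diagonal comparison j."""
    n = len(R)
    own = [R[i][i + j] for i in range(n - j)]
    beyond = [R[i][i + k] for k in range(j + 1, n) for i in range(n - k)]
    c_j = sum(own) / len(own) - sum(beyond) / len(beyond)
    # Each of the len(own) coefficients is 1/len(own); each of the
    # len(beyond) coefficients is -1/len(beyond).
    sum_sq_coef = 1 / len(own) + 1 / len(beyond)
    return c_j ** 2 / sum_sq_coef


# Hypothetical 4 x 4 matrix, for illustration only.
R = [
    [1.00, 0.90, 0.80, 0.70],
    [0.90, 1.00, 0.85, 0.75],
    [0.80, 0.85, 1.00, 0.88],
    [0.70, 0.75, 0.88, 1.00],
]
ss_1 = ss_comparison(R, 1)
ss_2 = ss_comparison(R, 2)
```

Dividing each of these by an estimate of the error variance would give the F ratio for the corresponding comparison.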


In this connection it may be helpful to contrast the analysis into
diagonal components with simple analysis of variance. Table 4 presents
the sources of variation, sums of squares, and degrees of freedom for an
analysis of the n(n-1)/2 correlations into between- and within-diagonal
components. This is a simple (one-way) analysis of variance with unequal
numbers in the (n-1) groups.

Table 5 presents the sources of variation, sums of squares, and degrees
of freedom for the same correlations analyzed by diagonal comparisons. In this
light the diagonal comparisons are simply another way of breaking down the
between-diagonal variation; in fact, the two sums of squares are equal. That is,

\[ \sum_{j=1}^{n-2} SS(C_j) = \sum_{j=1}^{n-1} (n-j)\,(\bar{r}_j - R_1)^2, \]

where $R_1$ is the average of all correlations in the matrix.

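Hence, the within-diagonal SS in the simple analysis of variance is the same as
the residual SS in the diagonal comparisons. That is, with $R_1$ the mean of all
the correlations,

\[ \sum_{j=1}^{n-1} \sum_{i=1}^{n-j} (r_{ij} - R_1)^2 - \sum_{j=1}^{n-2} SS(C_j)
   = \sum_{j=1}^{n-1} \sum_{i=1}^{n-j} (r_{ij} - \bar{r}_j)^2. \]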

We then find that

\[ E\left[ \frac{2}{(n-1)(n-2)} \sum_{j=1}^{n-1} \sum_{i=1}^{n-j} (r_{ij} - \bar{r}_j)^2 \right] = \sigma_e^2 \]

in both analyses. It should be noted that the degrees of freedom come to one less
than the number of correlations in the matrix. In the case of the simple analysis
this one degree of freedom is absorbed by [illegible in source]. In diagonal
comparisons it is absorbed by [illegible in source].

The residual or within-diagonal SS may be further broken down into linear
and nonlinear components as a function of trial or day number. This further
analysis, however, is entirely straightforward and will be taken up in the
context of a concrete example. The formal framework for an analysis into
diagonal comparisons is now in hand.
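
The equality between the comparison sums of squares and the between-diagonal sum of squares can be checked numerically. The sketch below uses a hypothetical matrix and hypothetical helper names; it treats the diagonals as the groups of a one-way analysis of variance and compares the two decompositions.

```python
def diagonals(R):
    """All diagonals above the main diagonal, nearest first."""
    n = len(R)
    return [[R[i][i + j] for i in range(n - j)] for j in range(1, n)]


def mean(xs):
    return sum(xs) / len(xs)


def between_within(R):
    """Between- and within-diagonal sums of squares (one-way layout)."""
    diags = diagonals(R)
    grand = mean([r for d in diags for r in d])
    between = sum(len(d) * (mean(d) - grand) ** 2 for d in diags)
    within = sum((r - mean(d)) ** 2 for d in diags for r in d)
    return between, within


def total_comparison_ss(R):
    """Sum of the single-degree-of-freedom SS over all diagonal comparisons."""
    diags = diagonals(R)
    total = 0.0
    for j in range(len(diags) - 1):  # the last diagonal has no comparison
        own = diags[j]
        beyond = [r for d in diags[j + 1:] for r in d]
        c = mean(own) - mean(beyond)
        total += c ** 2 / (1 / len(own) + 1 / len(beyond))
    return total


# Hypothetical 4 x 4 matrix, for illustration only.
R = [
    [1.00, 0.90, 0.80, 0.70],
    [0.90, 1.00, 0.85, 0.75],
    [0.80, 0.85, 1.00, 0.88],
    [0.70, 0.75, 0.88, 1.00],
]
between, within = between_within(R)
comparisons = total_comparison_ss(R)
```

Because the comparisons only repartition the between-diagonal variation, the within-diagonal term is left untouched and serves as the residual in both analyses.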

D. The first five days on Critical Tracking. Table 6 presents the correlations
among the first five days on Critical Tracking. Note that the correlations increase
strongly in the comparison diagonals. This tendency for the correlations to
increase with practice is common where differential change is taking place,
especially if it is rapid. It is not, of course, in itself evidence of
differential change. A unit-factor matrix with increasing factor loadings,
for example, would show the same effect. A decision as to whether differential
change is present or absent depends solely on the diagonal comparisons. At the
same time, regular change within diagonals is an assignable source of variation
and should not be included in the error term.

Table 7 presents the analysis into diagonal comparisons for the first five
days on Critical Tracking. The diagonal comparisons absorb three degrees of
freedom and the linear components within diagonals absorb another three degrees
of freedom. The residual term also has three degrees of freedom. None of the
F ratios for the diagonal comparisons reaches significance at the .05 level.
In this case, however, a conclusion that no local change is taking place is not
warranted because with only five trials the analysis does not have sufficient
power.
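
The degrees of freedom quoted for Table 7 follow from counting. The sketch below is a bookkeeping aid under one assumption that the text implies but does not state: a linear component is fitted in every diagonal that holds at least two correlations (diagonals 1 through n-2). The function name and dictionary keys are hypothetical.

```python
def degrees_of_freedom(n):
    """Degrees of freedom in an analysis into diagonal comparisons
    for an n x n correlation matrix."""
    n_corr = n * (n - 1) // 2          # correlations above the main diagonal
    comparisons = n - 2                # one per diagonal except the last
    within = n_corr - (n - 1)          # within-diagonal df (n - 1 diagonals)
    linear_within = n - 2              # assumed: one slope per diagonal of length >= 2
    residual = within - linear_within
    return {
        "correlations": n_corr,
        "comparisons": comparisons,
        "linear_within": linear_within,
        "residual": residual,
    }


five_days = degrees_of_freedom(5)
```

With five trials the residual term has very few degrees of freedom, which is one way to see why the test has little power in this case.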

E. The last ten days. Table 8 presents diagonal statistics for the last ten
days on Critical Tracking. Included are $\bar{r}_j$, $R_{j+1}$, $C_j$ and $SS(C_j)$. The average
correlation, $\bar{r}_j$, is largest in the superdiagonal, second largest in diagonal
2, and smallest in diagonal 8. The rank correlation between $\bar{r}_j$ and j,
-0.79, is significant at the .01 level. This result is sufficient in itself
to conclude that some local change is still taking place in the last ten days
of Critical Tracking.

The analysis into diagonal comparisons is presented in table 9. None of
the diagonal comparisons is significant at the .05 level, although the F ratio
for diagonal 1 falls midway between the critical values for the .10 and .05
levels with 1 and 28 degrees of freedom. Certainly, the local tendencies
toward superdiagonal form in the last ten trials of Critical Tracking are not
sufficient to upset our earlier conclusion that the task stabilizes after day 5.

Transformations and power considerations

The model used in testing for general differential change is

\[ r_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}, \]

where $\mu$, $\alpha_i$, and $\beta_j$ are fixed effects and $\epsilon_{ij}$, as usual, is a normally
distributed random variable with a mean of zero and variance $\sigma_e^2$. This
model implies bilateral compound symmetry. That is, the expected variance along
any two rows is the same and the expected variance down any two columns is the
same. Similarly the expected covariance between any two rows (or columns) is
the same as between any other two rows (or columns).

These consequences may not, of course, be supported by the facts. If the
columns evidence stability, it is likely that the column variances will be
homogeneous. However, if the rows show differential change, as is usual, it
is likely that the later rows with higher average r's will have smaller variances.
Further, if there are changes in the variances along either the rows or
columns, it is unlikely that either of the two covariance requirements will
be met.

The best approach to this problem is to subject the correlations to Fisher's
z transformation. The effect is to "lengthen out" the intervals toward the high
end of the correlation scale and this, in turn, tends to homogenize the row
and column variances and, hence, to improve the case for compound symmetry
in all respects.

A related problem arises in the local analysis. When there is differential
change, the correlations within a diagonal tend to increase. This increase,
however, may not be linear but negatively accelerated, especially if the
correlations exceed .80. The root cause here is the same as in the previous
problem, namely, that numerically equal intervals are larger high in the
correlation scale than they are lower down. The solution too is the same as before.
Fisher's z transformation lengthens out the intervals at the high end of the
scale and thereby straightens out the regression with trial or day number. The
importance of this straightening out is that it purifies the error term by
removing from it a known source of systematic error.
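
The "lengthening out" at the high end of the scale is easy to see numerically. The sketch below applies Fisher's z transformation, z = 0.5 ln((1 + r)/(1 - r)), which is the inverse hyperbolic tangent; the particular correlation values are chosen here only for illustration.

```python
import math


def fisher_z(r):
    """Fisher's z transformation of a correlation r, with -1 < r < 1."""
    return math.atanh(r)  # identical to 0.5 * log((1 + r) / (1 - r))


# The same numerical step of .10 in r, taken low and high on the scale.
low_step = fisher_z(0.60) - fisher_z(0.50)
high_step = fisher_z(0.90) - fisher_z(0.80)
```

The step from .80 to .90 is more than twice as long in z units as the step from .50 to .60, which is the sense in which the transformation stretches the upper end of the scale.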

As intimated earlier in at least two places, power may also be a problem.
If we find no significant difference from one column to the next in the general
analysis, we conclude that the task has stabilized. Clearly, however, this
conclusion raises the power question. What is the probability that we would
have obtained a nonsignificant result if the regression coefficient along the
columns had been $b \neq 0$? With five days or trials, the power of the general
analysis is certainly not strong enough to warrant a conclusion of stability.
With ten days, however, it is much stronger, although even longer series of trials
would be desirable. There is, however, a limit to the amount of work that
empirical investigators can be expected to do in order to meet statistical
requirements.

In this connection it is worth noting that failure to meet the requirements
of compound symmetry in the general analysis, while it gives the
analysis a positive bias, also increases its power.

The power question also arises in relation to the local analysis. If we
find no significant diagonal effects, we conclude that superdiagonal form is
absent. But what is the probability that we would have obtained a nonsignificant
result if $(\delta_j - \Delta_{j+1})$, the population difference between diagonal j and
the diagonals beyond it, had, in fact, equalled a definite nonzero amount? Here
again our only resort is to longer sequences of trials. Finally, power decreases
as one moves away from the main diagonal.

2. FIVE STABLE AND WELL-DEFINED TASKS

Code Responses

Table 10 presents the analysis of variance for general change in the last
ten days relative to the first five for Code Responses. The key result is
the value of F (0.52) for linear change along the columns. The row effects
make it clear that stabilization could not be fixed any earlier than the sixth
day. The differences among the means for the first five days are not only
overwhelmingly significant; the means also increase regularly, with one small
inversion, from .539 on the first to .781 on the fifth day.

The diagonal averages among the last ten days (table 11) show shallow and
certainly innocuous tendencies toward superdiagonal form. The average
correlation among the last ten days is .72. This value is definitely low for task
definition, perhaps too low. Certainly, a stable task with good definition
would be preferred over Code Responses.

Grammatical  Reasoning 

The  results  for  Grammatical  Reasoning  are  novel  in  two  respects.  The  first 
is  that  the  last  ten  days  are  not  stable  relative  to  the  first  five.  The  linear 
column  component  is  strongly  significant.  When  this  happens,  one  moves  to  the 
next  trial  and  sees  if,  perhaps,  the  task  may  not  stabilize  from  this  more 
advanced  point  in  the  practice  sequence.  In  our  case,  we  test  the  last  nine 
days  (days  7  through  15)  against  the  first  six.  If  the  linear  component  is 
still significant, one takes still another step into the practice sequence and
tests  again.  This  process  continues  until  the  trials  that  the  subjects  have 
practiced  after  the  one  being  tested  are  too  few  to  provide  an  acceptably 
powerful  test  of  general  differential  change.  Our  convention  is  to  stop  at 
day  10.  Thus,  we  start  by  testing  the  last  ten  days  against  the  first  five  and 
end,  if  no  stabilization  results,  by  testing  the  last  five  days  against  the  first 
ten.  If  the  linear  component  along  the  columns  is  still  significant,  we  conclude 
that  as  far  as  our  data  go  the  task  does  not  stabilize. 
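
The stepping procedure can be sketched as a loop over split points. The code below is a schematic under stated assumptions: the correlation matrix is a 15 x 15 list of lists with day 1 at index 0, and the significance test is passed in as a caller-supplied function (`linear_is_significant`, a hypothetical stand-in for the F test on the linear column component, which is not reproduced here). The helpers `column_averages` and `slope` are likewise illustrative names.

```python
def column_averages(R, split):
    """Average correlation of each later day (column) with the first
    `split` days (rows)."""
    n = len(R)
    return [sum(R[i][j] for i in range(split)) / split for j in range(split, n)]


def slope(ys, first_day):
    """Least-squares slope of ys regressed on consecutive day numbers."""
    xs = [first_day + t for t in range(len(ys))]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def first_stable_day(R, linear_is_significant, first_split=5, last_split=10):
    """Step through the practice sequence. Return the first day of the
    stabilized series, or None if the linear column component is still
    significant at the last split."""
    for split in range(first_split, last_split + 1):
        if not linear_is_significant(R, split):
            return split + 1
    return None
```

With the defaults, the first test sets the last ten days against the first five and the final test sets the last five days against the first ten, matching the convention of stopping at day 10.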

Table  12  presents  the  average  correlations  for  the  last  nine  days  against 
the  first  six  in  Grammatical  Reasoning.  The  linear  component  along  the  columns 
is still significant (F = 20.6), but only because of the low average on day 15.
The regression coefficient (average r regressed on day number) is -0.0122. If
day 15 is dropped, this same regression coefficient becomes -0.00125; the latter
is  ten  times  smaller  in  absolute  terms  than  the  former.  Table  13  presents  the 
analysis  of  variance  for  general  change  in  days  7  through  14  on  Grammatical 
Reasoning  relative  to  the  first  six  days  on  the  same  task.  Grammatical  Reasoning 




is  clearly  stable  over  this  stretch  of  eight  days.  It  remains  to  justify 
dropping day 15 or, at least, to explain what the implications of doing so
are.

The first point we need to recognize is that a task may stabilize for
a while and then start changing again; it may plateau, if you like. Grammatical
Reasoning is definitely stable from day 7 through day 14. Its being so, however,
in no way requires that it remain stable thereafter. It is perfectly possible
that the low column average on day 15 is simply the start of a new phase in
differential development on the task. The odds are against it, however.

The  18  volunteers  who  practiced  Grammatical  Reasoning  knew  that  day  15  would 
be their last day on this task. It is possible, therefore, that some of them
performed somewhat differently on this last day than they ordinarily did. In
other words, the subjects may have responded on day 15 to the fact that this
day was to be their last on this task. Such a reaction is an exogenous effect;
it is a response to an altered subjective condition (the task is ending).
Occurring, however, as it does at the end of practice, the effect of such a
reaction is to produce the semblance of linear change.

We  cannot  be  sure,  of  course,  that  this  interpretation  is  correct. 

The  main  point  in  its  favor  is  that  the  regression  coefficient  up  to  day  15, 
that  is,  from  day  7  through  day  14,  is  essentially  flat.  In  addition,  nonlinear 
change  over  this  series  of  eight  trials,  while  significant,  is  modest.  In  general, 
the  column  averages  are  not  bouncing  around  a  great  deal.  Hence,  the  marked 
fall  on  day  15  requires  more  of  an  explanation.  Finally,  the  effect  is  not 
isolated; we will see it again in other tasks. We conclude, therefore, that
Grammatical Reasoning would probably remain stable more or less indefinitely after
day 6, provided subjective conditions also remained the same.

Table  14  presents  the  diagonal  averages  for  days  7  through  14  on  Grammatical 
Reasoning.  With  the  exception  of  diagonal  6  the  fall-off  away  from  the  main 
diagonal  is  regular.  It  is  also  quite  shallow,  however,  and  poses  no  problems 
of any consequence. Task definition from days 7 through 14 is high, 0.881.

Arithmetic

Arithmetic  also  presents  novel  problems.  This  task  has  two  measures, 
number  attempted  and  number  correct.  The  results  for  general  change  in  the 
last ten days relative to the first five are presented in tables 15 and 16.

Note  that  linear  change  along  the  columns  is  significant  in  both  cases.  The 
regression  coefficients,  however,  are  both  positive!  They  are  also  very  small 
and,  as  it  happens,  equal,  +0.0027.  What  are  we  to  make  of  this  state  of 
affairs? 

The intertrial correlations for Arithmetic are very high and tightly bunched.
The correlations on number attempted range from .85 to .97 and on number correct
from .86 to .97. The residual term is minuscule. The change along the columns
for all 10 days comes to only +0.024, certainly a small amount. But what about
its direction? How are we to account for the increase in column correlations
over the last ten days?
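
The +0.024 figure appears to be the regression coefficient carried across the nine day-to-day steps from day 6 to day 15. The sketch below shows that arithmetic; the function name is hypothetical, and the reading of "10 days" as nine intervals is an interpretation, not something the text states outright.

```python
def total_change(slope_per_day, first_day, last_day):
    """Predicted change in the column average between two days, given a
    linear regression coefficient per day."""
    return slope_per_day * (last_day - first_day)


# Arithmetic: coefficient of +0.0027 per day over days 6 through 15.
arithmetic_change = total_change(0.0027, 6, 15)
```

The same calculation applies to any of the column regressions reported for a ten-day series.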

The answer lies in the row effects. The changes here are much larger and
more  significant  than  the  column  trends.  And  they  too  go  the  wrong  way!  That 
is,  the  row  means  tend  to  decrease  from  day  1  through  day  5.  On  number  attempted, 
for  example,  the  average  correlation  on  day  1  is  .952  and  on  day  5  it  is  .895. 

These results altogether exclude differential change of the superdiagonal
sort. In the sense that we have been using the term, Arithmetic is stable from
day 1. For the first week performance seems to acquire rather more random
elements and thereafter gradually to lose them. The effect is to create a
shallow bow in the correlation pattern, but a bow created by slow swings
in specific variance. The correlation matrix is a Spearman unit hierarchy with
somewhat smaller factor loadings around day 5 than either before or after. This
pattern, however, is consistent with stability, in fact, stability of a rock-solid
order.
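
A small simulation shows how a single-factor matrix with a dip in the loadings produces this pattern. All loadings below are hypothetical values chosen for illustration; they are not estimates from the Arithmetic data. Each correlation is taken as the product of the two days' loadings, which is the defining property of a one-factor (unit hierarchy) matrix.

```python
# Hypothetical factor loadings for days 1-15: lowest at day 5,
# somewhat higher both before and after.
loading = {}
for day in range(1, 16):
    if day <= 5:
        loading[day] = 0.94 + 0.01 * (5 - day)
    else:
        loading[day] = 0.94 + 0.003 * (day - 5)


def r(i, j):
    """Correlation between days i and j under a single common factor."""
    return loading[i] * loading[j]


rows = range(1, 6)    # first five days
cols = range(6, 16)   # last ten days
row_means = [sum(r(i, j) for j in cols) / len(cols) for i in rows]
col_means = [sum(r(i, j) for i in rows) / len(rows) for j in cols]
```

Under these assumed loadings the row means fall from day 1 to day 5 and the column means rise from day 6 to day 15, even though nothing but specific variance is changing.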

When they came into service, our volunteers had already learned all the
arithmetic they would ever know. They were already stabilized on this task
and no change would subsequently occur in common variance, that is, between
one day and any other. Local change in Arithmetic is negligible and task
definition very high, 0.949 for number attempted and 0.94[?] for number correct.

The Stroop Test

The Stroop Test yields three measures: blocks/words, colored words, and
colored blocks. Tables 17 and 18 present the analyses for general change in the
last ten days for the first two measures. Both measures are stable after day 5.

Colored blocks presents a more complicated picture. The F ratio for the
linear column component is 7.47, just short of significance at the .01 level.
The regression coefficient for the column averages is -0.005 or a decrease of
0.045 over the 10-day period. As in Grammatical Reasoning, however, this decrease
stems entirely from a low average for day 15. If day 15 is dropped, the
coefficient becomes ever so slightly positive, +0.0005; and linear column change
is no longer significant (F = 0.33).

Our interpretation of these results is the same as for Grammatical Reasoning,
that is, that in all probability the low average on day 15 is attributable to an
altered subjective condition (the task is ending). In this case the
interpretation is supported by the clear stability of the other two measures
on the same task. We conclude, therefore, that the Stroop Test is stable on
all three measures after day 5.

Table 19 presents the diagonal averages for the last ten days on colored
blocks. Superdiagonal form appears to be entirely absent. The same, more or
less, is true of blocks/words and colored words. Task definition is good for
all three measures, 0.827 for blocks/words, 0.867 for colored words, and 0.883
for colored blocks.

Two-Dimensional Tracking

Although it was not discovered until after the experiment was completed, the
software used in Two-Dimensional Tracking contained a "dead" spot. When the
cursor was placed on this spot, it remained there with no further control
movements. When the experimenters finally discovered this dead spot, they
interviewed the subjects concerning it. Several of the subjects reported that
they discovered the spot around day 8 and subsequently made use of it from
time to time to "beat" the task. The existence of this spot and its discovery
by some subjects (not all, apparently) fundamentally altered the task after
day 8 or thereabouts.

Table 20 presents the analysis of variance for days 6-9 on Two-Dimensional
Tracking relative to days 1-5 on the same task. The linear column component
is vanishingly small. The regression coefficient along the columns is -0.0002.
With only four trials as a basis, the test for linear column change is admittedly
not powerful. Nevertheless, our judgment is that Two-Dimensional Tracking
stabilizes after five days. Note, by the way, the strong linear component
down the rows. The row means show regular change through day 5. Hence,
stabilization cannot be said to begin any earlier than day 6.




Table 21 presents the correlations between days 6-9 (the so-regarded
stabilized trials) and days 10-15. For the first three days (10-12) the
correlations hold up well, but then they fall dramatically. It appears,
therefore, that beginning sometime in the third week Two-Dimensional
Tracking underwent rapid differential change, presumably because of the
dead spot and its discovery. It is bothersome, of course, that the dead
spot seems not to have had an immediate effect on differential processes.

On the other hand, it appears to have had no effect at all on group processes.
The mean and variance show no disturbances at all as a result of the dead spot.
Looking at the two curves, no one would suspect that anything unusual had
happened toward the end of the second week or at any other time in practice.
This point has obvious methodological importance since it underscores the
sensitivity of differential processes to changes that would otherwise go
unnoticed.

Task definition in days 6-9 on Two-Dimensional Tracking is passable but not
good, 0.767.

5. FOUR UNSTABLE OR ILL-DEFINED TASKS

Complex Counting

Table 22 presents the analysis of variance for general change in the last
ten days of Complex Counting relative to the first five days on the same task.
The F ratio for the linear column component is strongly significant (F value
illegible in source).

The linear column component is still significant when the last nine days are
tested for general change relative to the first six days. In fact, the linear
column component remains significant for the last eight, seven, six and five
days. Table 23 presents the results for days 11-15 versus the first ten days.




The  F  ratio  for  the  linear  column  component  is  smaller  than  in  table  22  but 
still  strongly  significant  (F=14.1). 

This time, moreover, the decrease along the columns is not due solely
to the last day. The regression coefficient for days 11-15 is -0.021 and for
days 11-14 it is -0.017, smaller, to be sure, but not greatly so. The linear
column component for days 11-14 relative to the first ten days is, moreover,
still significant, albeit at the .05 level.

These gradually lowering levels of significance as one moves further and
further into the practice sequence suggest that at some point Complex Counting
does, indeed, stabilize. That point, however, is not reached after ten days
of practice.

Time  Estimation 

The  results  for  Time  Estimation  are  much  the  same  as  for  Complex  Counting. 
Table 24 presents the average correlations of the last five days with each of
the first ten days. The main point is that these averages increase right up to
and including day 10. Hence, the first ten trials are changing relative to the
last  five  as  an  external  measure.  There  is  no  possibility,  therefore,  that  Time 
Estimation  stabilizes  any  earlier  than  day  11.  The  question  is,  does  it  stabilize 
then? 

Table 25 presents the analysis of variance for general change in the last
five days relative to the first ten days. Linear change from row to row is
enormous; it yields the largest F value we have seen, thus confirming the
conclusion already reached that differential change continues through day 10.

Since linear change from column to column is also strongly significant, F=15.4,
it would seem that Time Estimation is still not stable after ten days.




There is, however, reason to pause for a moment. The correlation average
for day 15 is lower than for any other day in the last week. If day 15 is
omitted, the regression coefficient along the columns drops from -0.038 to
-0.018 and the linear column component is no longer significant. If the low
r for day 15 can be reasonably attributed to altered subjective conditions
(the task is ending), then a case could be made that Time Estimation stabilizes
after ten days. We do not think, however, that the low r on day 15 can be so
attributed.
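
The sensitivity check described here, dropping the final day and recomputing the regression coefficient along the columns, can be sketched as follows. Ordinary least squares of the column average on day number is assumed to be the regression meant, and the column averages in the example are hypothetical placeholders, not the Time Estimation data.

```python
def slope(days, averages):
    """Ordinary least-squares slope of the column averages on day number."""
    n = len(days)
    mean_day = sum(days) / n
    mean_avg = sum(averages) / n
    sxy = sum((d - mean_day) * (a - mean_avg) for d, a in zip(days, averages))
    sxx = sum((d - mean_day) ** 2 for d in days)
    return sxy / sxx


def slope_with_and_without_last(days, averages):
    """Return the slope over all days and the slope with the last day omitted."""
    return slope(days, averages), slope(days[:-1], averages[:-1])


# Hypothetical column averages for days 11-15 (illustration only).
days = [11, 12, 13, 14, 15]
averages = [0.40, 0.42, 0.36, 0.38, 0.25]
full, trimmed = slope_with_and_without_last(days, averages)
```

A slope that shrinks sharply once the last day is removed indicates that the apparent trend rests largely on that one day, which is the situation the text weighs for day 15.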


In the first place, both column and row averages bounce around a good deal
in Time Estimation. The drop, for example, from day 12 to day 13 is almost
as large as the drop from day 14 to day 15. Similarly, as can be seen in
table 24, the row averages also change sharply from one day to the next. Hence,
the drop from day 14 to day 15 is by no manner of means a unique occurrence in
this set of data.

In the second place, local change appears to continue in Time Estimation
through the last five days. Table 26 presents the diagonal statistics for
days 11 through 15 and table 27 the analysis into diagonal components. The
F ratio for diagonal 3 (7.1) is significant at the .10 level. In addition,
all three comparisons are positive, and the probability of this result is 0.125
by itself. This last consideration, that is, how many of the C_j are positive
and how many negative, tests the same hypothesis as do the diagonal components,
namely, that local differential change continues in the last five days of practice;
moreover, it does so independently. Combining these four tests (Winer, 1967,
pp. 49-50) yields a combined value which is significant at the .07 level. Local
change, therefore, appears to be at least likely in days 11 through 15.
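
Two pieces of arithmetic in this paragraph can be made concrete. If each comparison is equally likely to be positive or negative under the null hypothesis, three positives out of three have probability (1/2)^3 = 0.125. For the combination of independent tests, the sketch below assumes Fisher's method (minus twice the sum of the natural logarithms of the p values, referred to chi-square with 2k degrees of freedom); the report cites Winer for the procedure, and that this is exactly the computation used is an assumption. The individual p values in the example are hypothetical, since only the .10 level for diagonal 3 and the 0.125 sign probability are stated.

```python
import math


def sign_probability(n_comparisons):
    """Probability that every comparison is positive under a fair coin."""
    return 0.5 ** n_comparisons


def fisher_combined(p_values):
    """Fisher's method: returns (chi_square, df, combined p value)."""
    k = len(p_values)
    chi_square = -2.0 * sum(math.log(p) for p in p_values)
    # For even df = 2k the chi-square upper tail has a closed form.
    half = chi_square / 2.0
    tail = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
    return chi_square, 2 * k, tail


# Hypothetical p values for three diagonal F tests plus the sign test.
p_values = [0.30, 0.40, 0.10, sign_probability(3)]
chi_square, df, p_combined = fisher_combined(p_values)
```

The closed-form tail avoids any dependence on a statistics library; with a single p value the method returns that p value unchanged, which is a convenient sanity check.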




On these grounds we conclude that Time Estimation does not stabilize
after 10 days and probably not after 15 days. Task definition in the last
five days is marginal, 0.718.

Letter Search

Task definition for Letter Search is unacceptably low. The average
correlation among days 6-15 is 0.422 and among days 11-15 only a little better,
0.510. Whatever else may be said about it, Letter Search is not a suitable task
for performance testing, whether it stabilizes or not. Hence, we pursue its
analysis no further.
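
Task definition, as the term is used here, is reported as the average correlation among a set of days. A minimal sketch of that computation follows, assuming a plain unweighted mean of the correlations above the main diagonal (whether the report averages raw r's or transformed values is not stated in this section). The small matrix is hypothetical.

```python
def task_definition(corr):
    """Mean of the correlations above the main diagonal of a symmetric matrix."""
    n = len(corr)
    upper = [corr[i][j] for i in range(n) for j in range(i + 1, n)]
    return sum(upper) / len(upper)


# Hypothetical 3-day correlation matrix (illustration only).
corr = [
    [1.00, 0.50, 0.40],
    [0.50, 1.00, 0.60],
    [0.40, 0.60, 1.00],
]
definition = task_definition(corr)
```

For ten days the average runs over 45 correlations and for five days over 10, so the two figures quoted for a task (for example days 6-15 and days 11-15) rest on quite different numbers of entries.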

Spoke Trail-Making

Spoke Trail-Making was, in effect, two tasks, a standard or control task
and an experimental variation. On the evidence in hand neither of these tasks
meets the requirements for performance testing. Task definition for the
experimental form is too low. The average correlations among days 6-15 and
11-15 are 0.444 and 0.502 respectively.

The case of the control task is more complicated. Testing the last ten days
against the first five one finds an F ratio for the linear column component which
is significant at the .05 level. Testing the last five days against the first ten
one finds a larger F ratio for the linear column component, significant at the .01
level. These results, however, are due entirely to a low r on the next to last
day, day 14. If we exclude day 14, the average r's of days 6-15 with the first
five days range from a low of .724 to a high of .806. The average correlation of
day 14 with the first five days is 0.452! It may be that this low r was due to
some unrecorded change in test circumstances or subjective conditions. If so,
then the standard form of Spoke Trail-Making is stable and has good task definition.
On the existing evidence, however, we have no grounds for excluding day 14. We can
hardly argue that the subjects "wound down" on the day before the test
ended but not the last day. For the time being, therefore, we regard the
control form of Spoke Trail-Making as unstable.




REFERENCES 

Alvares, K. M. & Hulin, C. L. Two explanations of temporal changes in ability-
skill relationships: A literature review and theoretical analysis. Human
Factors, 1972, 14, 295-308.

Alvares, K. M. & Hulin, C. L. An experimental evaluation of a temporal decay
in the prediction of performance. Organizational Behavior and Human
Performance, 1973, 9, 169-185.

Dunham, R. B. Ability-skill relationships: An empirical explanation of change
over time. Organizational Behavior and Human Performance, 1974, 12, 372-382.

Jones, M. B. Differential processes in acquisition. In E. A. Bilodeau and I. McD.
Bilodeau (Eds.), Principles of skill acquisition. New York: Academic Press,
1969. (a)

Jones, M. B. Knowledge of results and intertrial correlations in a simple
motor task. Journal of Motor Behavior, 1969, 1, 331-340. (b)

Jones, M. B. A two-process theory of individual differences in motor learning.
Psychological Review, 1970, 77, 353-360. (a)

Jones, M. B. Rate and terminal process in skill acquisition. American Journal
of Psychology, 1970, 83, 222-236. (b)

Jones, M. B. Individual differences. In R. N. Singer (Ed.), The psychomotor
domain. Philadelphia: Lea and Febiger, 1972.

Joreskog, K. G. Estimation and testing of simplex models. British Journal of
Mathematical and Statistical Psychology, 1970, 23, 121-145.

Kennedy, R. S., & Bittner, Jr., A. C. The development of a Performance Evaluation
Test for Environmental Research (PETER). In Productivity enhancement in
Navy systems. San Diego, California: Naval Personnel Research and Development
Center, October, 1977.




Kennedy, R. S., & Bittner, Jr., A. C. The stability of complex human performance
for extended periods: Application for studies of environmental stress.
Presented at the Aerospace Medical Association, New Orleans,
May, 1978 (printed in Preprints). (a)

Kennedy, R. S., & Bittner, Jr., A. C. Progress in the analysis of a Performance
Evaluation Test for Environmental Research (PETER). Proceedings of the
Annual Meeting of the Human Factors Society. Santa Monica, California,
1978. (b)

Morrison, D. F. Multivariate statistical methods. New York: McGraw-Hill, 1967.

Winer, B. J. Statistical principles in experimental design (2nd ed.).
New York: McGraw-Hill, 1971.


Table 2

Analysis of variance for general change in the last ten days of
Critical Tracking relative to the first five days on the same task.

Source             SS       df      MS         F

Rows             1.624       4     0.406      58.0 *
  linear         1.402       1     1.402     200.3 *
  nonlinear      0.222       3     0.074      10.6 *
Columns          0.213       9     0.024       3.4 *
  linear         0.007       1     0.007       0.9
  nonlinear      0.206       8     0.026       3.7 *
Residual         0.251      36     0.007       ---
Total            2.087      49      ---        ---

* Significant at the .01 level


Correlations  among  days  6  through  15  on  Critical  Tracking 


Table 4

Sources of variation, sums of squares, and degrees of freedom for the
analysis of a correlation matrix into between and within diagonal
components.

                                        Degrees of freedom
Source                SS                n                    n=10

Between diagonals     [illegible]       n - 2                  8
Within diagonals      [illegible]       (n - 1)(n - 2)/2      36
Total                 [illegible]       n(n - 1)/2 - 1        44
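
The degrees of freedom in the n=10 column follow from counting: ten days give 45 distinct correlations (44 degrees of freedom in all) arranged on nine superdiagonals (8 degrees of freedom between diagonals, 36 within). The sketch below assumes the usual one-way partition with each superdiagonal treated as a group; the function name is hypothetical, and the sums of squares are the standard between-group and within-group formulas rather than a transcription of the table.

```python
def diagonal_partition(corr):
    """Partition the variation among correlations into between/within diagonals.

    Superdiagonal j holds the correlations between days that are j days apart.
    """
    n = len(corr)
    diagonals = [[corr[i][i + j] for i in range(n - j)] for j in range(1, n)]
    values = [r for diag in diagonals for r in diag]
    grand = sum(values) / len(values)
    ss_between = sum(len(d) * (sum(d) / len(d) - grand) ** 2 for d in diagonals)
    ss_within = sum((r - sum(d) / len(d)) ** 2 for d in diagonals for r in d)
    return {
        "ss_between": ss_between, "df_between": len(diagonals) - 1,
        "ss_within": ss_within, "df_within": len(values) - len(diagonals),
        "ss_total": sum((r - grand) ** 2 for r in values),
        "df_total": len(values) - 1,
    }


# Hypothetical 10-day matrix whose correlations fall off with separation.
n_days = 10
matrix = [[1.0 if i == j else 0.9 - 0.03 * abs(i - j) + 0.001 * min(i, j)
           for j in range(n_days)] for i in range(n_days)]
parts = diagonal_partition(matrix)
```

In a matrix with a pronounced superdiagonal pattern most of the variation falls between diagonals; a small between-diagonal component is what a stable set of days would be expected to show.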


Table 5

Sources of variation, sums of squares, and degrees of freedom
for the analysis of a correlation matrix into diagonal-comparison
and residual components.

Source                   SS               Degrees of freedom

Diagonal comparisons     [illegible]      [illegible]
Residual                 [illegible]      [illegible]
Total                    [illegible]      [illegible]


Table 7

Analysis by diagonal comparisons for the correlations among the
first five days on Critical Tracking.

Source of Variation        SS       df      MS         F

Diagonal comparisons     0.0286      3     0.0095     2.79
  diagonal 1             0.0023      1     0.0023     0.68
  diagonal 2             0.0150      1     0.0150     4.41
  diagonal 3             0.0113      1     0.0113     3.32
Linear trend             0.3322      3     0.1107    32.56
  diagonal 1             0.2184      1     0.2184    64.24
  diagonal 2             0.0800      1     0.0800    23.53
  diagonal 3             0.0338      1     0.0338     9.94
Residual                 0.0101      3     0.0034      ---
Total                    0.3709      9
H 


Table 8

Diagonal statistics for the last ten days on Critical Tracking.

Diagonal (j)   (n - j)   Mean r,       Mean r,            C_j      SS(C_j)
                         diagonal j    later diagonals

     1            9        0.854          0.767           0.087     0.0545
     2            8        0.824          0.751           0.073     0.0331
     3            7        0.783          0.740           0.043     0.0097
     4            6        0.742          0.739           0.003     0.0000
     5            5        0.758          0.730           0.028     0.0026
     6            4        0.735          0.727           0.008     0.0002
     7            3        0.757          0.697           0.060     0.0054
     8            2        0.690          0.710          -0.020     0.0003

Table 9

Analysis by diagonal comparisons for the last ten days on Critical Tracking.

Source of Variation        SS       df      MS         F

Diagonal comparisons     0.1058      8     0.0132     0.88
  diagonal 1             0.0545      1     0.0545     3.63
  diagonal 2             0.0331      1     0.0331     2.21
  diagonal 3             0.0097      1     0.0097     0.65
  diagonal 4             0.0000      1     0.0000     0.00
  diagonal 5             0.0026      1     0.0026     0.17
  diagonal 6             0.0002      1     0.0002     0.01
  diagonal 7             0.0054      1     0.0054     0.36
  diagonal 8             0.0003      1     0.0003     0.02
Linear trend             0.1086      8     0.0136     0.91
  diagonal 1             0.0068      1     0.0068     0.45
  diagonal 2             0.0001      1     0.0001     0.01
  diagonal 3             0.0066      1     0.0066     0.44
  diagonal 4             0.0082      1     0.0082     0.55
  diagonal 5             0.0078      1     0.0078     0.52
  diagonal 6             0.0065      1     0.0065     0.43
  diagonal 7             0.0004      1     0.0004     0.03
  diagonal 8             0.0722      1     0.0722     4.83
Residual                 0.4199     28     0.0150
Total                    0.6343     44

Table 10

Analysis of variance for general change in the last ten days of Code Response
relative to the first five days on the same task.

Source             SS           df      MS         F

Rows             0.380           4     0.095      5.94
  linear         0.312           1     0.312     19.50
  nonlinear      0.068           3     0.023      1.44
Columns          0.137           9     0.015      0.92
  linear         0.008           1     0.008      0.50
  nonlinear      0.129           8     0.016      1.00
Residual         0.563          36     0.016
Total            [illegible]    49


Table 12

Average correlations between days 7 through 15 and days 1 through 6 on
Grammatical Reasoning, organized by the former.

Last nine days     Average correlation with first six days

      7                 0.708
      8                 0.715
      9                 0.707
     10                 0.740
     11                 0.478
     12                 0.708
     13                 0.6[?]8
     14                 0.775
     15                 0.552


Table 13

Analysis of variance for general change in days 7 through 14 on Grammatical
Reasoning relative to the first six days on the same task.

Source             SS        df      MS          F

Rows             0.6484       5     0.1297      68.3 *
  linear         0.5876       1     0.5876     309.3 *
  nonlinear      0.0608       4     0.0152       8.0 *
Columns          0.0634       7     0.0091       4.8 *
  linear         0.0004       1     0.0004       0.2
  nonlinear      0.0630       6     0.0105       5.5 *
Residual         0.0649      35     0.0019
Total            0.7767      47

* Significant at the .01 level.


Table 14

Diagonal averages for days 7 through 14 on Grammatical Reasoning.

Diagonal (j)     (n - j)     Average r

     1              7          0.880
     2              6          0.878
     3              5          0.874
     4              4          0.872
     5              3          0.860
     6              2          [illegible]


Table 15

Analysis of variance for general change in the last ten days of Arithmetic
relative to the first five days on the same task. The measure is "number
attempted."

Source             SS        df      MS           F

Rows             0.0176       4     0.0044       9.8 *
  linear         0.0137       1     0.0137      30.4 *
  nonlinear      0.0039       3     0.0013       2.9 **
Columns          0.0086       9     0.0010       1.4
  linear         0.0030       1     0.0030       6.7 **
  nonlinear      0.0056       8     0.0007       1.6
Residual         0.0161      36     0.00045
Total            0.0423      49       ---

 * Significant at the .01 level.
** Significant at the .05 level.

Table 16

Analysis of variance for general change in the last ten days of Arithmetic
relative to the first five days on the same task. The measure is "number
correct."

Source             SS        df      MS          F

Rows             0.0154       4     0.0038      10.0 *
  linear         0.0110       1     0.0110      28.5 *
  nonlinear      0.0044       3     0.0015       3.8 **
Columns          0.0102       9     0.0011       2.8 **
  linear         0.0029       1     0.0029       7.5 *
  nonlinear      0.0073       8     0.0009       2.4
Residual         0.0139      36     0.0004       ---
Total            0.0395      49

 * Significant at the .01 level.
** Significant at the .05 level.


Table 17

Analysis of variance for general change in the last ten days of the Stroop
Test relative to the first five days on the same task. The measure is
[illegible].

Source             SS        df      MS          F

Rows             0.5252       4     0.1313      50.5 *
  linear         0.4225       1     0.4225     162.5 *
  nonlinear      0.1027       3     0.0342      13.2 *
Columns          0.0703       9     0.0078       3.0 *
  linear         0.0052       1     0.0052       2.0
  nonlinear      0.0651       8     0.0081       3.1 *
Residual         0.0919      36     0.0026
Total            0.6874      49

* Significant at the .01 level


Table 18

Analysis of variance for general change in the last ten days of the Stroop
Test relative to the first five days on the same task. The measure is
"colored words."

Source             SS        df      MS          F

Rows             0.7037       4     0.1759      83.8 *
  linear         0.5227       1     0.5227     248.9 *
  nonlinear      0.1810       3     0.0603      28.7
Columns          0.1360       9     0.0151       7.2 *
  linear         0.0000       1     0.0000       0.0
  nonlinear      0.1360       8     0.0170       9.0 *
Residual         0.0767      36     0.0021
Total            0.9165      49

* Significant at the .01 level.


Table 20

Analysis of variance for days 6-9 on Two-Dimensional Tracking relative to
days 1-5 on the same task.

Source             SS        df      MS          F

Rows             0.2262       4     0.0566      25.7 *
  linear         0.1836       1     0.1836      83.5 *
  nonlinear      0.0426       3     0.0142       6.5 *
Columns          0.0042       3     0.0014       0.6
  linear         0.0000       1     0.0000       0.0
  nonlinear      0.0042       2     0.0021       1.0
Residual         0.0267      12     0.0022       ---
Total            0.2572      19

* Significant at the .01 level.


Table 21

Correlations between days 6-9 and days 10-15 on Two-Dimensional Tracking.

Day        10       11       12       13       14       15      Average r

 6        .71      .39      .64      .38      .21      .21       0.423
 7        .84      .62      .77      .46      .30      .42       0.568
 8        .77      .52      .71      .34      .07      .27       0.447
 9        .70      .77      .78      .37      .12      .34       0.513

Average r 0.755    0.575    0.725    0.388    0.175    0.310     0.488
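
The marginal averages in table 21 are plain means of the tabled correlations, which can be checked directly. The sketch below recomputes them from the 24 entries as they read in the table; the variable names are illustrative.

```python
# Correlations between days 6-9 (rows) and days 10-15 (columns), table 21.
table_21 = {
    6: [0.71, 0.39, 0.64, 0.38, 0.21, 0.21],
    7: [0.84, 0.62, 0.77, 0.46, 0.30, 0.42],
    8: [0.77, 0.52, 0.71, 0.34, 0.07, 0.27],
    9: [0.70, 0.77, 0.78, 0.37, 0.12, 0.34],
}
later_days = [10, 11, 12, 13, 14, 15]

# Row averages: each of days 6-9 against all six later days.
row_means = {day: sum(rs) / len(rs) for day, rs in table_21.items()}

# Column averages: each later day against all four of days 6-9.
col_means = {
    later: sum(rs[k] for rs in table_21.values()) / len(table_21)
    for k, later in enumerate(later_days)
}

# Overall average of the 24 correlations.
overall = sum(row_means.values()) / len(row_means)
```

The column averages make the pattern in the table easy to see: days 6-9 correlate well with days 10 and 12 and poorly with days 13 through 15.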

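Table 22

Analysis of variance for general change in the last ten days of Complex
Counting relative to the first five days on the same task.

Source            df      MS          F

Rows               4     0.0853      27.[?]
  linear           1     0.2948      95.[?]
  nonlinear        3     0.0154       5.[?]
Columns            9     0.0142       4.[?]
  linear           1     0.0814      26.[?]
  nonlinear        8     0.0058       1.[?]
Residual          36     0.0031
Total             49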


Table 23

Analysis of variance for general change in the last five days of Complex
Counting relative to the first ten days on the same task.

Source             SS        df      MS          F

Rows             0.2827       9     0.0314      10.1 *
  linear         0.0682       1     0.0682      22.0 *
  nonlinear      0.2145       8     0.0268       8.6 *
Columns          0.0515       4     0.0129       4.2 *
  linear         0.0437       1     0.0437      14.1 *
  nonlinear      0.0078       3     0.0026       0.8
Residual         0.1118      36     0.0031
Total            0.4460      49

* Significant at the .01 level


Table 24

Average correlations of the first ten days on Time Estimation with the
last five days on the same task.

Day      Average correlation with last five days

  1           -0.180
  2           +0.068
  3           -0.118
  4           -0.014
  5           +0.212
  6           +0.522
  7           +0.454
  8           +0.522
  9           +0.474
 10           +0.740


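Table 25

Analysis of variance for general change in the last five days of
Time Estimation relative to the first ten days on the same task.

Source            df      MS          F

Rows               9     0.5149      54.8 *
  linear           1     3.9646     421.8 *
  nonlinear        8     0.0837       8.9 *
Columns            4     0.0709       7.5 *
  linear           1     0.1444      15.4 *
  nonlinear        3     0.0459       4.9 *
Residual          36     0.0094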


Table 26

Diagonal statistics for the last five days in Time Estimation.

Diagonal (j)   (n - j)   Mean r,       Mean r,            C_j      SS(C_j)
                         diagonal j    later diagonals

     1            4        0.790          0.670           0.120     0.0346
     2            3        0.733          0.607           0.126     0.0238
     3            2        0.740          0.340           0.400     0.1067
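
The diagonal statistics can be reproduced under one reading of the table, offered here as an assumption rather than as the report's stated formula: each comparison C_j is the mean correlation on diagonal j minus the mean of all correlations on later diagonals, and its sum of squares is C_j squared times ab/(a+b), where a = n - j is the number of correlations on diagonal j and b is the number on all later diagonals. The input means are the tabled values as best they can be read, and the function name is hypothetical.

```python
def comparison_ss(c_j, n, j):
    """Sum of squares for comparing diagonal j with all later diagonals."""
    a = n - j                                  # correlations on diagonal j
    b = sum(n - m for m in range(j + 1, n))    # correlations on later diagonals
    return c_j ** 2 * a * b / (a + b)


n = 5  # days 11 through 15
# (mean r on diagonal j, mean r on later diagonals), as read from the table
means = {1: (0.790, 0.670), 2: (0.733, 0.607), 3: (0.740, 0.340)}

results = {}
for j, (diag_mean, later_mean) in means.items():
    c_j = diag_mean - later_mean
    results[j] = (round(c_j, 3), round(comparison_ss(c_j, n, j), 4))
```

Under this reading the three sums of squares come to about 0.0346, 0.0238 and 0.1067, which is what makes the reading plausible; the weighting ab/(a+b) is the usual one for a difference between two group means with a and b observations.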

Table 27

Analysis by diagonal comparisons for the last five days of
Time Estimation.

Source of Variation        SS       df      MS

Diagonal comparisons     0.1651      3     0.0550
  diagonal 1             0.0346      1     0.0346
  diagonal 2             0.0238      1     0.0238
  diagonal 3             0.1067      1     0.1067
Linear trend             0.2254      3     0.0751
  diagonal 1             0.0372      1     0.0372
  diagonal 2             0.1682      1     0.1682
  diagonal 3             0.0200      1     0.0200
Residual                 0.0451      3     0.0150
Total                    0.4356      9

